Sparse Matrix Computations on Parallel Processor Arrays
Authors
Abstract
We investigate the balancing of distributed compressed storage of large sparse matrices on a massively parallel computer. For fast computation of matrix-vector and matrix-matrix products on a rectangular processor array with efficient communications along its rows and columns, we require that the nonzero elements of each matrix row or column be distributed among the processors located within the same array row or column, respectively. We construct randomized packing algorithms with such properties, and we prove that with high probability they produce well-balanced storage for sufficiently large matrices with a bounded number of nonzeros in each row and column, but no other restrictions on structure. Then we design basic matrix-vector multiplication routines with fully parallel interprocessor communications and intraprocessor gather and scatter operations. Their efficiency is demonstrated on the 16,384-processor MasPar computer.
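The storage constraint above can be illustrated with a small sketch: assign every matrix row to a random processor-array row and every matrix column to a random processor-array column, so nonzero (i, j) is stored at the grid position determined by both maps. This is only an illustrative Python sketch of the randomized-packing idea, not the authors' actual algorithm; the grid shape, seed, and test pattern are assumptions.

```python
import random
from collections import Counter

def randomized_packing(nonzeros, P, Q, seed=0):
    """Place each nonzero (i, j) on a P x Q processor array so that all
    nonzeros of matrix row i share one array row and all nonzeros of
    matrix column j share one array column."""
    rng = random.Random(seed)
    row_map = {}   # matrix row    -> processor-array row
    col_map = {}   # matrix column -> processor-array column
    placement = {}
    for (i, j) in nonzeros:
        if i not in row_map:
            row_map[i] = rng.randrange(P)
        if j not in col_map:
            col_map[j] = rng.randrange(Q)
        placement[(i, j)] = (row_map[i], col_map[j])
    return placement

# A synthetic 100x100 pattern with 4 nonzeros per row (an assumption,
# standing in for a "bounded nonzeros per row/column" matrix).
nnz = [(i, (i * 7 + k) % 100) for i in range(100) for k in range(4)]
place = randomized_packing(nnz, 4, 4)
load = Counter(place.values())   # nonzeros held by each processor
```

With many more rows and columns than array rows and columns, the per-processor loads in `load` concentrate near the mean, which is the balance property the abstract refers to.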
Similar Resources
Support for Irregular Computations in Massively Parallel PIM Arrays, Using an Object-Based Execution Model
The emergence of semiconductor fabrication technology allowing a tight coupling between high-density DRAM and CMOS logic on the same chip has led to the important new class of Processor-In-Memory (PIM) architectures. Furthermore, large arrays of PIMs can be arranged into massively parallel architectures. In this paper, we outline the salient features of PIM architectures and discuss macroserver...
Efficient Multicore Sparse Matrix-Vector Multiplication for Finite Element Electromagnetics on the Cell-BE processor
Multicore systems are rapidly becoming a dominant industry trend for accelerating electromagnetics computations, driving researchers to address parallel programming paradigms early in application development. We present a new sparse representation and a two level partitioning scheme for efficient sparse matrix-vector multiplication on multicore systems, and show results for a set of finite elem...
Computations with symmetric, positive definite and band matrices on a parallel vector processor
Computations involving symmetric, positive definite and band matrices are kernel operations in the numerical treatment of many models arising in science and engineering. It is desirable to achieve a high level of performance when such operations are to be carried out on a vector processor. If the operations are performed by rows or columns (as in the EXTENDED BLAS subroutines), then the loops a...
Approximations for the General Block Distribution of a Matrix
The general block distribution of a matrix is a rectilinear partition of the matrix into orthogonal blocks such that the maximum sum of the elements within a single block is minimized. This corresponds to partitioning the matrix onto parallel processors so as to minimize processor load while maintaining regular communication patterns. Applications of the problem include various parallel sparse ...
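The objective described above, the maximum block sum over a rectilinear partition, can be evaluated directly. This is a minimal illustrative sketch of the cost function only, not the approximation algorithms of the cited paper; the cut-list representation is an assumption.

```python
def block_cost(matrix, row_cuts, col_cuts):
    """Maximum block sum of a rectilinear partition. row_cuts and col_cuts
    are sorted cut positions including 0 and the matrix dimensions; block
    (a:b, c:d) spans consecutive cut pairs."""
    best = 0
    for a, b in zip(row_cuts, row_cuts[1:]):
        for c, d in zip(col_cuts, col_cuts[1:]):
            s = sum(matrix[r][q] for r in range(a, b) for q in range(c, d))
            best = max(best, s)
    return best

M = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
# 2x2 partition: rows split after index 2, columns split after index 1.
cost = block_cost(M, [0, 2, 3], [0, 1, 3])  # heaviest block sums to 6
```

Minimizing this value over all cut placements balances processor load while the rectilinear structure keeps communication patterns regular.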
Data and Computation Abstractions for Dynamic and Irregular Computations
Effective data distribution and parallelization of computations involving irregular data structures is a challenging task. We address the twin-problems in the context of computations involving block-sparse matrices. The programming model provides a global view of a distributed block-sparse matrix. Abstractions are provided for the user to express the parallel tasks in the computation. The tasks...
Journal: SIAM J. Scientific Computing
Volume: 14, Issue: -
Pages: -
Published: 1993